08. Backpropagation- Theory

Backpropagation Theory

Since partial derivatives are the key mathematical concept used in backpropagation, it's important that you feel confident in your ability to calculate them. Once you know how to calculate basic derivatives, calculating partial derivatives is easy to understand.
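As a quick self-check, here is a small sketch (not part of the lesson's quizzes) that compares analytic partial derivatives against a numerical finite-difference estimate. The function f(x, y) = x²y and the helper `numeric_partial` are illustrative choices, not defined anywhere in the lesson.

```python
# For f(x, y) = x^2 * y, the analytic partials are:
#   df/dx = 2*x*y   and   df/dy = x^2
def f(x, y):
    return x ** 2 * y

def numeric_partial(func, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative
    with respect to the argument at position `index`."""
    args_plus = list(point)
    args_minus = list(point)
    args_plus[index] += h
    args_minus[index] -= h
    return (func(*args_plus) - func(*args_minus)) / (2 * h)

x, y = 3.0, 2.0
print(numeric_partial(f, (x, y), 0))  # close to 2*x*y = 12
print(numeric_partial(f, (x, y), 1))  # close to x^2  = 9
```

Note that when differentiating with respect to x, the variable y is held constant, and vice versa; that is exactly the "partial" in partial derivative.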
For more information on partial derivatives, use the following link.

For the calculations in this lesson's upcoming quizzes, you can use the following link as a reference for common derivatives.

In the backpropagation process, we reduce the network error slightly with each iteration by adjusting the weights. The following video will help you understand the mathematical process we use for computing these adjustments.

08 Backpropagation Theory V6 Final

If we look at an arbitrary layer k, we can denote the amount by which we change the weight connecting neuron i in layer k to neuron j in the next layer as \Delta W^k_{ij}.

The superscript k indicates that the weight connects layer k to layer k+1.

Therefore, the update rule for that weight can be expressed as:

W_{new} = W_{previous} +\Delta W^k_{ij}

Equation 4

The update \Delta W_{ij}^k is calculated using the gradient of the error, in the following way:

\Delta W_{ij}^k=\alpha (-\frac{\partial E}{\partial W}), where \alpha is a small positive number called the **Learning Rate**.

Equation 5

From this derivation, we can easily see that the weight updates are calculated by the following equation:

W_{new}= W_{previous} +\alpha (-\frac{\partial E}{\partial W} )

Equation 6
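A minimal sketch of Equation 6 for a single weight, under an assumed toy error function E(W) = (W − 4)², whose derivative is dE/dW = 2(W − 4) and whose minimum sits at W = 4. The error function and learning rate here are illustrative choices, not values from the lesson.

```python
# Toy error E(W) = (W - 4)^2, so dE/dW = 2*(W - 4); minimum at W = 4.
def dE_dW(W):
    return 2.0 * (W - 4.0)

alpha = 0.1   # learning rate (small positive number)
W = 0.0       # arbitrary starting weight
for _ in range(100):
    # Equation 6: W_new = W_previous + alpha * (-dE/dW)
    W = W + alpha * (-dE_dW(W))

print(round(W, 4))  # approaches 4.0, the minimum of E
```

The minus sign is what makes this *descent*: the weight moves opposite to the gradient, so each step slightly decreases the error.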

Since many weights determine the network’s output, we can collect the partial derivatives of the network error, each taken with respect to a different weight, into a single gradient vector, denoted by the nabla symbol \nabla.

W_{new}= W_{previous} +\alpha (-\nabla_W E)

Equation 7
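Equation 7 is simply Equation 6 applied to every weight at once. A hedged NumPy sketch of that vectorized update, again assuming a toy quadratic error E(W) = Σ(W − target)², whose gradient is \nabla_W E = 2(W − target); `target` and `alpha` are illustrative values, not from the lesson.

```python
import numpy as np

# Toy quadratic error over a weight vector: E(W) = sum((W - target)^2),
# with gradient grad_E(W) = 2 * (W - target).
target = np.array([1.0, -2.0, 0.5])

def grad_E(W):
    return 2.0 * (W - target)

alpha = 0.1
W = np.zeros(3)          # arbitrary starting weights
for _ in range(200):
    # Equation 7: W_new = W_previous + alpha * (-grad_E)
    W = W + alpha * (-grad_E(W))  # updates every weight at once

print(np.round(W, 4))  # approaches [ 1.  -2.   0.5]
```

The loop body is identical to the single-weight case; NumPy's elementwise arithmetic applies the same rule across the whole weight vector in one line.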

Here you can find other good resources for understanding and tuning the Learning Rate:

The following video is given as a refresher on overfitting. You have already seen this concept in the Training Neural Networks lesson. Feel free to skip it and jump right into the next video.

13 Overfitting Intro V4 Final